04/01/21 PH
Demo version with correlations & covariance imaging (LW06 test dataset) between electron VMI data and ion ToF data. Runs from pre-processed data, with additional filtering applied.
For dev work see LW06_correlations_tests_291220.ipynb.
29/12/20 v2
Implemented covariance VMI analysis routines (includes filtering).
10/12/20 v1
Basic correlation coefficient for ion and electron channels (raw data, only gas on/off filtering).
So far: the covariance (correlation coefficient) calculation is applied to ion ToF intensities and electron counts per shot.
Uses numpy.corrcoef (https://numpy.org/devdocs/reference/generated/numpy.corrcoef.html) to compute the matrices; this returns the normalised covariance (i.e. correlation coefficients).
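As a minimal sketch of this idea (synthetic arrays standing in for the LW06 data, with an artificially injected correlation in the first ion bin), np.corrcoef can be applied to the stacked per-shot electron counts and ion ToF intensities:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic per-shot data (hypothetical stand-in for the real arrays):
# eCounts = electron counts per shot; iTof = ion ToF intensities, shots x bins.
nShots, nBins = 1000, 6
eCounts = rng.poisson(500, nShots).astype(float)
iTof = rng.normal(size=(nShots, nBins))
iTof[:, 0] += 0.05 * eCounts  # inject a correlated channel for illustration

# np.corrcoef treats each ROW as a variable, so stack the electron counts
# with the transposed ToF array: the result is an (nBins+1, nBins+1) matrix.
cc = np.corrcoef(np.vstack([eCounts, iTof.T]))

# The first row (excluding the diagonal) gives e-i correlations per ion bin.
eiCorr = cc[0, 1:]
print(eiCorr.shape)  # (6,)
```

Only bin 0 should show a significant correlation here; the remaining bins are pure noise, so their coefficients sit near zero.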
import numpy as np
from h5py import File
from pathlib import Path
# HV imports
import holoviews as hv
from holoviews import opts
hv.extension('bokeh', 'matplotlib')
import xarray as xr
# Memory profiler - OPTIONAL for testing
# https://timothymonteath.com/articles/monitoring_memory_usage/
# %load_ext memory_profiler
# %memit
# Quick hack for slow internet connection!
%autosave 36000
# Image stacks can also be problematic to display, so disable by default
showImgStacks = False
# Import class - this requires pointing to the `tmoDataBase.py` file.
# See https://github.com/phockett/tmo-dev
# Import class from file
import sys
sys.path.append(r'/cds/home/p/phockett/dev/tmo-dev/tmo')
import tmoDataBase as tb
# from vmi import VMI # VMI class - currently has multiple filtering, ish.
# from inversion import VMIproc # VMI processing class
from correlations import corr # Correlations class
tb.setPlotDefaults() # Set plot defaults (optional)
# Set main data path
baseDir = Path('/reg/data/ana15/tmo/tmolw0618/scratch/preproc/v5')
keys = range(81,87+1)
# Setup class object and which runs to read.
# data = VMIproc(fileBase = baseDir, runList = range(81,87+1), fileSchema='_v5')
# data = VMIproc(fileBase = baseDir, runList = keys, fileSchema='_v5') # Testing with single dataset
# data = VMI(fileBase = baseDir, runList = keys, fileSchema='_v5') # Testing with single dataset
data = corr(fileBase = baseDir, runList = keys, fileSchema='_v5') # Testing with multiple datasets
# The class has a bunch of dictionaries holding info on the data
# data.runs['files']
# Read in the files with h5py
# There is very basic checking implemented currently, just to see if 'energies' is present.
data.readFiles()
For e-i correlations, use the data in the `intensities` variable, which corresponds to the ion ToF bins in the `ts` variable.
data.data[81]['raw']['intensities'].shape # Ion tof processed data, shots x M/Q bins
np.array(data.data[81]['raw']['ts']) # Ion tof bins
Ion ToF bins in the pre-processed data correspond to:
Possibly not the best binning, but it will do for testing, and it may even be useful to be blind here, with no preconceptions about the correlations. Bin 4 should be a good test of the correlations, since it should result in a blank image.
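For context, the `intensities` (shots x M/Q bins) array is a coarse binning of the per-shot ToF traces; a minimal sketch of that reduction (bin edges and array sizes are illustrative only, not the LW06 values) is:

```python
import numpy as np

# Hypothetical sketch: sum a raw per-shot ToF trace into coarse bins,
# analogous to the pre-processed `intensities` (shots x bins) array.
rng = np.random.default_rng(1)
nShots, nSamples = 100, 4096
tofTrace = rng.random((nShots, nSamples))

# Bin edges in sample index (illustrative only).
edges = np.array([0, 500, 1200, 2000, 3000, 4096])

# np.add.reduceat sums each segment [edges[k], edges[k+1]) along axis 1.
intensities = np.add.reduceat(tofTrace, edges[:-1], axis=1)
print(intensities.shape)  # (100, 5)
```

Since the edges cover the full trace, the binned intensities conserve the total per-shot signal.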
# Simple filter - set for gas on only
# TODO: implement multiple filter sets here.
# data.setFilter(reset = True)
data.setFilter({'gas on':{'evrs':[1,1,70]}}, reset = True) # , 'gas off':{'evrs':[0,0,70]}})
data.filterData()
# Run corrRun() to calculate covariance matrices for each run
data.corrRun()
data.hmap
data.hmap.layout().cols(2)
# data.setFilter()
data.setFilter({'gas off':{'evrs':[0,0,70]}}, reset = True)
data.filterData()
data.corrRun()
data.hmap
Some correlations are still visible here, although quite different - presumably residual background gas contributions. (Not yet checked carefully; this will also be clearer once channels are assigned.)
In the current implementation this calculates covariances for the electron (x,y) data against another data variable, e.g. ion intensities, on a per-shot basis. The routine uses numpy.corrcoef (https://numpy.org/devdocs/reference/generated/numpy.corrcoef.html) to calculate correlations for each (x,y) pixel against each other data dimension (e.g. multiple ion species). It is quite slow, but amenable to massive parallelisation with some effort.
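The per-pixel loop described above can be sketched as follows (array shapes and names are illustrative, not the tmo-dev API); this also makes clear why it is slow, since np.corrcoef runs once per pixel:

```python
import numpy as np

# Minimal sketch of the per-pixel covariance-imaging loop (assumed shapes).
rng = np.random.default_rng(2)
nShots, nx, ny, nBins = 500, 8, 8, 4
eImg = rng.poisson(1.0, (nShots, nx, ny)).astype(float)  # e- counts per pixel per shot
iTof = rng.normal(size=(nShots, nBins))                  # ion intensities per shot

corrImg = np.zeros((nx, ny, nBins))
for i in range(nx):
    for j in range(ny):
        # Correlate this pixel's per-shot counts with each ion channel;
        # row 0 of the matrix (minus the diagonal) holds the e-i terms.
        cc = np.corrcoef(np.vstack([eImg[:, i, j], iTof.T]))
        corrImg[i, j, :] = cc[0, 1:]
```

Each pixel is independent of the others, which is why the loop parallelises trivially in principle.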
To keep the code simple, the data is first converted to a sparse array (using the Sparse library); standard 3D array referencing is then possible with a minimal memory footprint. It also gives a nice array summary output.
# For testing, set filter first...
# Simple filter - set for gas on only
# TODO: implement multiple filter sets here.
data.setFilter(reset = True)
data.setFilter({'unsat':{'evrs':[1,1,70], 'eShot':[0,2000,0], 'desc':'Unsaturated signal, electron counts < 2000.'}}) #'gas on':{'evrs':[1,1,70]}, 'gas off':{'evrs':[0,0,70]}})
# Gas on/off, set event code #70. Unsaturated signal, check electron counts < 2000.
data.filterData()
data.filters
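Conceptually, each filter entry reduces to a per-shot boolean mask; a minimal sketch of the `'eShot':[0,2000,...]` style condition (the mask construction here is illustrative, not the tmo-dev implementation):

```python
import numpy as np

# Per-shot filtering sketch: build a boolean mask from a [min, max, ...]
# style condition, analogous to 'eShot':[0, 2000, 0] above.
rng = np.random.default_rng(4)
eShot = rng.integers(0, 3000, 1000)  # synthetic electron counts per shot

mask = (eShot >= 0) & (eShot < 2000)  # keep unsaturated shots only
filtered = eShot[mask]
print(mask.mean())  # fraction of shots kept
```

Multiple conditions (e.g. event codes plus count thresholds) would simply AND their masks together before indexing the data.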
# Generate covariance VMI images.
# This takes a while (~30 mins for the test dataset) with the current implementation, which loops over pixels and correlated datasets (default is the 'intensities' data), and is NOT YET PARALLELISED!
# Note that the default applies x4 downsampling on the (x,y) pixels; alternatively the ratios can be passed as a parameter, as here.
# At x2 downsampling, this takes ~1 hour per run.
data.genCorrVMIXmulti(saveSparse=True, downsampleRatios = [2,2,1])
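The (x,y) downsampling amounts to summing pixel blocks; a minimal sketch of a [2,2,1] block reduction on an (x, y, bins) stack (reshape-based sum, assumed here as an illustration of the idea rather than the tmo-dev routine):

```python
import numpy as np

# Block-downsample an (x, y, bins) stack by ratios [2, 2, 1],
# i.e. sum 2x2 pixel blocks while leaving the bin axis untouched.
img = np.arange(8 * 8 * 3, dtype=float).reshape(8, 8, 3)
ds = img.reshape(4, 2, 4, 2, 3).sum(axis=(1, 3))
print(ds.shape)  # (4, 4, 3)
```

Summing (rather than averaging) blocks preserves total counts, which matters for Poisson-limited data.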
*** TESTING seems OK, but filters are not working so far? Could be a super() issue, or data being accidentally overwritten?
These are the (electron) VMI results for each run and ToF feature. Use the sliders to select.
# Show images
data.showImgSet(dims = ['xc','yc'])
# The data is stored in the imgStack parameter.
data.imgStack
# Quick save to disk...
# data.corrStack.to_netcdf("LW06_corrStack_test_040121.nc") # Fails on attribs
data.corrStackCP = data.corrStack.copy()
data.corrStackCP['unsat'].attrs = {}
data.corrStackCP.to_netcdf("LW06_corrStack_test_040121.nc") # Runs... 72Mb at 2,2 resample
# To reload: ds_disk = xr.open_dataset("LW06_corrStack_test_040121.nc")
# tmo-dev class version
data.__version__
import scooby
scooby.Report(additional=['holoviews', 'xarray', 'sparse'])